We have already covered word2vec, LDA, and LSA as ways to analyze words, sentences, and documents. But those are clearly not enough for us to understand a specific sentence. We’d like to embed a sentence, and then we’ll be able to do a bunch of exciting things with it.
How does Word2Vec embed a word?
Dig out word relationships using Word2Vec
In the last posts, we investigated LSA and LDA to classify documents. Their outputs are embeddings of terms and documents, which we can use to group terms and documents with clustering techniques. LSA and LDA both belong to the count-based methods of Vector Space Models. VSMs have a...
How does LDA assign topic probabilities to documents?
Using Bayesian inference to predict the probability of each topic for documents.
In the last post, we went through LSA to get an embedding matrix. However, clustering documents has an obvious shortcoming, since a document may have multiple themes. This post introduces LDA, which uses Bayesian inference to get the posterior probability of topics in each document, also the posterior probability of...
How does LSA classify documents?
Simplest NLP technique
Imagine you have a bunch of documents that you’d like to classify based on their topics. The intuitive idea is to pick out the keywords of each document, then cluster them into different categories. You can’t pick them out manually, but you could represent each document using a matrix. The...